Reviewer 1

Figure 1: Protein dataset with random forest, across 140 evaluations, with different NN structures for the distGPs.
Thank you for all the reviewers' time and effort.

Thank you for your detailed review. The idea is to re-train our model whenever new data becomes available. Here we explain our design space (see additional details in Appendices A.3, B, and C): (i) choice of embedding (joint vs.

Reviewer 3: Thank you for your review; for comments regarding experiments, please see above. Thank you for your positive comments regarding the quality of the paper.
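To make the "re-train when new data is available" scheme above concrete, here is a minimal illustrative sketch in numpy, not our actual model: it assumes an exact GP regressor (plausible given the distGPs in Figure 1), and the kernel, noise level, and the name `RetrainedGP` are all assumptions made for illustration.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

class RetrainedGP:
    """Exact GP regression, re-trained from scratch each time new data arrives."""

    def __init__(self, noise=1e-2):
        self.noise = noise
        self.X = np.empty((0, 1))
        self.y = np.empty(0)

    def update(self, X_new, y_new):
        # Append the newly arrived batch, then refit the posterior.
        self.X = np.vstack([self.X, X_new])
        self.y = np.concatenate([self.y, y_new])
        K = rbf(self.X, self.X) + self.noise * np.eye(len(self.y))
        self.alpha = np.linalg.solve(K, self.y)

    def predict(self, X_star):
        # Posterior mean at the query points.
        return rbf(X_star, self.X) @ self.alpha
```

Each call to `update` refits on the union of old and new data, which is the simple baseline version of the scheme; a production variant would amortize this (e.g., via low-rank updates), but the retraining logic is the same.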
Reviewer 1: Unclear about the evaluation for outer iterations; does the number of aggregated tasks affect it? Yes, the total complexity is proportional to the number of aggregated tasks.

Add experiments to compare ANIL and MAML, and w.r.t. the sample size B: Why is the inner-loop sample size not taken into the analysis, as Fallah et al. [4] do? This setting has also been considered in Rajeswaran et al. [24] and Ji et al. [13].

Reviewer 2: Dependence on κ. iMAML depends on κ, in contrast to the poly(κ) of this work. Add an experiment to verify the tightness: Great point! We will definitely add such an experiment in the revision. We will clarify it in the revision.
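To make the ANIL/MAML comparison and the cost scaling above concrete, here is a toy numpy sketch (not the analyzed algorithms; the model, scales, and the name `inner_adapt` are illustrative assumptions): ANIL freezes the body and adapts only the head, and one meta-iteration's gradient work grows linearly in the number of aggregated tasks T and the inner-loop batch size B.

```python
import numpy as np

rng = np.random.default_rng(0)

def inner_adapt(w_body, w_head, X, y, lr=0.01, steps=5, anil=False):
    """Task-specific inner loop on squared loss for a two-layer linear model.
    MAML updates all parameters; ANIL freezes the body and adapts only
    the head, so each inner step touches far fewer parameters."""
    for _ in range(steps):
        feats = X @ w_body                               # (B, h) features
        err = feats @ w_head - y                         # (B,) residuals
        g_head = feats.T @ err / len(y)
        if not anil:
            g_body = np.outer(X.T @ err, w_head) / len(y)
            w_body = w_body - lr * g_body
        w_head = w_head - lr * g_head
    return w_body, w_head

def task_loss(w_body, w_head, X, y):
    return 0.5 * np.mean((X @ w_body @ w_head - y) ** 2)

# T aggregated tasks, each with an inner-loop batch of size B:
# one meta-iteration costs O(T * B * steps) gradient work, so total
# complexity is proportional to the number of aggregated tasks.
d, h, T, B = 5, 3, 4, 16
w_body = 0.3 * rng.normal(size=(d, h))
w_head = 0.3 * rng.normal(size=h)
tasks = []
for _ in range(T):
    X = rng.normal(size=(B, d))
    y = X @ rng.normal(size=d)                           # task-specific targets
    tasks.append((X, y))
```

In this toy, increasing B enters only through the per-step gradient cost, which is why an analysis can choose whether to track it explicitly, as discussed above.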
Figure 1: Evaluation with an added comparison to PEARL, showing meta-training curves on full-state pushing (left) and ant locomotion.
We thank the reviewers for their positive and constructive feedback. The primary concern from Reviewer 1 was the comparison to PEARL (Rakelly et al.).

Reviewer 1: See the PEARL comparisons above. GMPS significantly outperforms PEARL on sparse-reward tasks, and is better able to learn out-of-distribution tasks.

Reviewer 3: As requested, we ablate the number of consecutive outer updates, using 500 imitation steps (blue).
We thank the reviewers for their constructive feedback on our paper. We especially appreciate the reviewers' conviction that our implementation will be a widely used tool for embedding convex optimization problems in end-to-end learning.

Reviewers 1 and 2 found some of our explanations of ASA form and DPP difficult to follow. We will clarify these, and will also explain the motivation for our ruleset (Reviewer 1's guess is essentially correct). This is what we meant by our vague phrasing "jointly DCP ... [with] one. We will separately explain how to reduce certain expressions in which parameters are multiplied together (e.g., in the revision, we will make sure to clearly explain this point.
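As a sketch of the kind of parameter-product reduction mentioned above (all names are hypothetical, and this uses plain numpy rather than our implementation): a product of two parameters is not affine in the parameters, so a DPP-style ruleset rejects it; the standard workaround is to precompute a single derived parameter outside the problem, so the problem itself stays affine in its parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical least-squares layer whose data matrix is the product of
# two parameters P1 and P2. The expression P1 @ P2 is not affine in
# (P1, P2), so a DPP-style ruleset would reject it inside the problem.
P1 = rng.normal(size=(6, 4))
P2 = rng.normal(size=(4, 3))
b = rng.normal(size=6)

# The reduction: introduce one derived parameter A = P1 @ P2, computed
# outside the problem, so the problem is affine in A alone.
A = P1 @ P2
x_reduced, *_ = np.linalg.lstsq(A, b, rcond=None)

# Sanity check: solving with the explicit product gives the same x.
x_direct, *_ = np.linalg.lstsq(P1 @ P2, b, rcond=None)
```

The derived-parameter trick trades a little extra bookkeeping at canonicalization time for a problem description that remains affine in its parameters, which is what fast repeated solves and differentiation rely on.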